# Using the Python API

Lighteval can be used from a custom Python script. To evaluate a model, you will need to set up an
[`~logging.evaluation_tracker.EvaluationTracker`], [`~pipeline.PipelineParameters`],
a [`model`](package_reference/models) or a [`model_config`](package_reference/model_config),
and a [`~pipeline.Pipeline`].

Then run the pipeline, save the results, and display them.

```python
import lighteval
from lighteval.logging.evaluation_tracker import EvaluationTracker
from lighteval.models.vllm.vllm_model import VLLMModelConfig
from lighteval.pipeline import ParallelismManager, Pipeline, PipelineParameters
from lighteval.utils.imports import is_package_available

# If accelerate is installed, create the Accelerator up front with a
# longer process-group timeout (3000 seconds).
if is_package_available("accelerate"):
    from datetime import timedelta
    from accelerate import Accelerator, InitProcessGroupKwargs
    accelerator = Accelerator(kwargs_handlers=[InitProcessGroupKwargs(timeout=timedelta(seconds=3000))])
else:
    accelerator = None

def main():
    evaluation_tracker = EvaluationTracker(
        output_dir="./results",
        save_details=True,
        push_to_hub=True,
        hub_results_org="your_username",  # Replace with your actual username
    )

    pipeline_params = PipelineParameters(
        launcher_type=ParallelismManager.ACCELERATE,
        custom_tasks_directory=None,  # Set to path if using custom tasks
        # Caps the number of evaluated samples for a quick test run.
        # Remove the parameter below once your configuration is tested.
        max_samples=10,
    )

    model_config = VLLMModelConfig(
        model_name="HuggingFaceH4/zephyr-7b-beta",
        dtype="float16",
    )

    task = "gsm8k|5"

    pipeline = Pipeline(
        tasks=task,
        pipeline_parameters=pipeline_params,
        evaluation_tracker=evaluation_tracker,
        model_config=model_config,
    )

    pipeline.evaluate()
    pipeline.save_and_push_results()
    pipeline.show_results()

if __name__ == "__main__":
    main()
```

## Key Components

### EvaluationTracker
The `EvaluationTracker` handles logging and saving evaluation results. It can save results locally and optionally push them to the Hugging Face Hub.
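
After a run, you may want to read the saved results back into Python. The following is a minimal sketch, assuming the tracker writes its results as JSON files somewhere under `output_dir`. The exact directory layout and file names are not specified here, so the hypothetical helper `load_latest_results` searches recursively and picks the most recently modified JSON file.

```python
import json
from pathlib import Path


def load_latest_results(output_dir: str) -> dict:
    """Load the most recently modified JSON file found under output_dir.

    Hypothetical helper, not part of Lighteval. It assumes results are
    stored as JSON files somewhere below output_dir.
    """
    json_files = list(Path(output_dir).rglob("*.json"))
    if not json_files:
        raise FileNotFoundError(f"No JSON files found under {output_dir}")

    # Pick the file with the newest modification time.
    latest = max(json_files, key=lambda path: path.stat().st_mtime)
    with latest.open(encoding="utf-8") as handle:
        return json.load(handle)
```

With the example script above, `load_latest_results("./results")` would return the parsed contents of the newest JSON file written under `./results`, if one exists.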

### PipelineParameters
`PipelineParameters` configures how the evaluation pipeline runs, including parallelism settings and task configuration.

### Model Configuration
Model configurations define the model to be evaluated, including the model name, data type, and other model-specific parameters. Each backend (vLLM, Transformers, etc.) has its own configuration class.

### Pipeline
The `Pipeline` orchestrates the entire evaluation process, taking the tasks, model configuration, and parameters to run the evaluation.

## Running Multiple Tasks

You can evaluate on multiple tasks by passing either a comma-separated string of task names or the path to a file that lists them:

```python
# Multiple tasks as comma-separated string
tasks = "aime24,aime25"

# Or load from a file
tasks = "./path/to/tasks.txt"

pipeline = Pipeline(
    tasks=tasks,
    # ... other parameters
)
```
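
When the task list is assembled programmatically, it can help to build the comma-separated string in one place. The sketch below uses a hypothetical helper, `build_task_string`, which is not part of Lighteval. The optional `|<n>` suffix mirrors the `"gsm8k|5"` string from the first example; reading `n` as a few-shot count is an assumption here.

```python
from typing import Iterable, Optional


def build_task_string(task_names: Iterable[str], num_fewshot: Optional[int] = None) -> str:
    """Join task names into a single comma-separated task string.

    Hypothetical helper. Blank names are skipped and duplicates are
    dropped while keeping the first-seen order.
    """
    unique_names = []
    for name in task_names:
        name = name.strip()
        if name and name not in unique_names:
            unique_names.append(name)

    if not unique_names:
        raise ValueError("no task names given")

    if num_fewshot is None:
        return ",".join(unique_names)

    # Assumption: the "|<n>" suffix follows the "gsm8k|5" pattern shown earlier.
    return ",".join(f"{name}|{num_fewshot}" for name in unique_names)


tasks = build_task_string(["aime24", "aime25"])
```

The resulting string can then be passed as the `tasks` argument of `Pipeline`, as in the block above.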

## Custom Tasks

To use custom tasks, set the `custom_tasks_directory` parameter to the path containing your custom task definitions:

```python
pipeline_params = PipelineParameters(
    custom_tasks_directory="./path/to/custom/tasks",
    # ... other parameters
)
```

For more information on creating custom tasks, see the [Adding a Custom Task](adding-a-custom-task) guide.
